About Our Group's Notebook:¶

This notebook is organized into nine steps. New users can work through them sequentially to run the notebook successfully.

Our Group's goal for this Notebook:¶

Our group's goal was to create a straightforward notebook, so that new users only need this single notebook to understand how the ClimSim datasets work.

What info could be helpful for future new visitors to the ClimSim GitHub Page?¶

  1. Mounted Google Drive in this notebook so the data can be uploaded and loaded easily. Because the information introducing the ClimSim dataset is scattered across the GitHub repo, the GitHub Pages site, and Hugging Face, no single page gives an intuitive overview at once. Integrating information from all three pages into this notebook makes it easier to comprehend.

  2. Added '''TODO''' comments so that new users can run the quickstart notebook by changing only the minimum lines of code, namely the parts marked with '''TODO''' (paths are already set up using relative paths).

  3. Added detailed explanations and available options for each process, so the steps in this notebook are easy to follow.

  4. Explained the structure of the dataset and specified that subsampled data is used in the baseline model, making the data-loading process easier to understand.

  5. Information about the ClimSim repo structure lives on the GitHub Pages site, but since useful information is scattered across three different sites, we embedded it as images in this notebook for a more intuitive understanding.

How we made it easier for ML researchers to use ClimSim Data¶

  1. Added code guidelines for creating and activating a virtual environment, so that users can avoid errors caused by conflicting environment settings.
  2. Showed steps to load existing baseline models in this notebook.
  3. Implemented a simpler version of the CNN model so that users only need this notebook to understand the existing models
  4. Addressed M1 Chip compatibility issues
In [130]:
from IPython.display import Image, display
In [137]:
# Define the path to the image
image_path = "../../Project3-ClimSim-Fall2023-Group-5/figs/"
img0 = Image(filename=image_path + "(img0)ClimSim_GitHub_Repo_Structure.png")
img1 = Image(filename=image_path + "(img1)ClimSim_Repository_Main.png")
img2 = Image(filename=image_path + "(img2)ClimSim_Repository_clone_options.png")
img3 = Image(filename=image_path + "(img3)ClimSim_Models.png")
img4 = Image(filename=image_path + "(img4)CNN_M1_error.png")
img5 = Image(filename=image_path + "(img5)ClimSim_Save_Models.png")
In [151]:
img_val = Image(filename=image_path + "val_output.png")
img_scoring = Image(filename=image_path + "scoring_output.png")
In [ ]:
 

Now, our notebook begins!

ClimSim - Quickstart Guide Jupyter Notebook¶

This ClimSim_quickstart_Group5.ipynb file is a quick-start notebook for ClimSim by Group 5. You can run this file to quickly get up to speed with ClimSim, an ongoing machine learning research project of the LEAP Center.

If you need more information, visit the following GitHub IO page: https://leap-stc.github.io/ClimSim/README.html

Step 1. Cloning the ClimSim Repository¶

First, you need to clone the ClimSim GitHub Repository to your local machine.

ClimSim GitHub Repository url : https://github.com/leap-stc/ClimSim

It might not be easy to understand this GitHub repo at first glance.

Here's the GitHub Repo's structure: https://leap-stc.github.io/ClimSim/ARCHITECTURE.html

In [138]:
display(img0)

To clone the ClimSim GitHub Repo, first go to the ClimSim's GitHub Repository.

In [142]:
display(img1)

You have three options for cloning the ClimSim repo:

In [143]:
display(img2)

1) Running the following command on your local machine¶

In [ ]:
!git clone https://github.com/leap-stc/ClimSim.git

2) Clone with GitHub Desktop¶

Click <Code> button and then click Open with GitHub Desktop. Then select the location that you want to clone this Repo.

3) Download ZIP file for the ClimSim Repo¶

You can directly download ClimSim-main.zip file.

It may be easier to follow the remaining steps if you rename the unzipped folder to ClimSim.

To make the rest of the steps clear, here is one example of the ClimSim repo's directory path (on macOS): /Users/yoojin/Documents/GitHub/ClimSim/

Before Step 2 : Checking the Current Working Directory¶

Before going on to Step 2, make sure you open this Jupyter notebook from inside the ClimSim repo. This notebook is located at ClimSim/demo_notebooks/quickstart_example.ipynb.

In [1]:
import os

# Check the current working directory 
current_directory = os.getcwd()

print("Current Working directory :", current_directory)
Current Working directory : /Users/yoojin/Documents/GitHub/ClimSim/demo_notebooks

As shown above, the current working directory should be inside the ClimSim GitHub repository on your local machine.
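As a small optional sketch (the helper function and the expected folder names below are our own, not part of ClimSim), you can also check the location programmatically:

```python
from pathlib import Path

def in_repo_subdir(cwd, subdir="demo_notebooks", repo="ClimSim"):
    """Hypothetical helper: True if cwd looks like .../ClimSim/demo_notebooks."""
    p = Path(cwd)
    return p.name == subdir and p.parent.name == repo

# Example with the path printed above:
print(in_repo_subdir("/Users/yoojin/Documents/GitHub/ClimSim/demo_notebooks"))  # True
```

In your own session you would pass `os.getcwd()` instead of the hard-coded example path.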

Step 2. Installation of ClimSim Python packages¶

First, create a virtual environment of your own, dedicated exclusively to the ClimSim project.

In [ ]:
''' 
TODO: Set up the virtual Environment, and then activate it.
'''
In [ ]:
# create new virtual environment for ClimSim project
!conda create -n climsim_env python=3.9
Then, activate the virtual environment.
In [ ]:
# Activate the conda environment (the same command works on Windows and Mac)
!conda activate climsim_env

Install the dependencies using conda install

In [ ]:
!conda install -c conda-forge jupyter numexpr bottleneck pandas numpy tensorflow xarray scikit-learn
!conda install pytorch torchvision torchaudio -c pytorch
In [ ]:
!pip install -U scikit-learn
In [ ]:
!pip install tensorflow
In [ ]:
!pip install notebook
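Before moving on, a quick availability check can confirm that the key packages installed correctly (the package list below is just a suggestion; adjust it to your setup):

```python
import importlib.util

# Report which of the packages used later in this notebook are importable.
packages = ["numpy", "pandas", "xarray", "sklearn", "tensorflow", "torch"]
status = {pkg: importlib.util.find_spec(pkg) is not None for pkg in packages}
for pkg, ok in status.items():
    print(f"{pkg:12s} {'OK' if ok else 'MISSING'}")
```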

In Step 1 you cloned the ClimSim GitHub repo. Now, in Step 2, you need to install the climsim_utils Python package using pip.

In [8]:
# Check the current working directory 
current_directory = os.getcwd()

print("Current Working directory :", current_directory)
Current Working directory : /Users/yoojin/Documents/GitHub/ClimSim/demo_notebooks

One important thing here: you need to run the pip installation from the root of the ClimSim GitHub repository (on your local machine). Make sure the current working directory is the ClimSim repository root.

In [ ]:
''' 
TODO: Check the current working directory and make sure to navigate up one level, to the ClimSim directory
'''
In [3]:
# Going back to ClimSim directory 
%cd ../
/Users/yoojin/Documents/GitHub/ClimSim

With ClimSim as your current working directory, execute the following command to install the ClimSim Python package using pip.

In [13]:
!pip install .
Processing /Users/yoojin/Documents/GitHub/ClimSim
  Preparing metadata (setup.py) ... done
Collecting xarray (from climsim-utils==0.0.1)
  Downloading xarray-2023.10.1-py3-none-any.whl.metadata (10 kB)
Collecting numpy (from climsim-utils==0.0.1)
  Downloading numpy-1.26.1-cp310-cp310-macosx_10_9_x86_64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.2/61.2 kB 2.7 MB/s eta 0:00:00
Collecting pandas (from climsim-utils==0.0.1)
  Downloading pandas-2.1.2-cp310-cp310-macosx_10_9_x86_64.whl.metadata (18 kB)
Collecting matplotlib (from climsim-utils==0.0.1)
  Downloading matplotlib-3.8.0-cp310-cp310-macosx_10_12_x86_64.whl.metadata (5.8 kB)
Collecting tensorflow (from climsim-utils==0.0.1)
  Downloading tensorflow-2.14.0-cp310-cp310-macosx_10_15_x86_64.whl.metadata (3.9 kB)
Collecting netCDF4 (from climsim-utils==0.0.1)
  Downloading netCDF4-1.6.5-cp310-cp310-macosx_10_9_x86_64.whl.metadata (1.8 kB)
Collecting h5py (from climsim-utils==0.0.1)
  Downloading h5py-3.10.0-cp310-cp310-macosx_10_9_x86_64.whl.metadata (2.5 kB)
Collecting tqdm (from climsim-utils==0.0.1)
  Using cached tqdm-4.66.1-py3-none-any.whl.metadata (57 kB)
Collecting contourpy>=1.0.1 (from matplotlib->climsim-utils==0.0.1)
  Downloading contourpy-1.1.1-cp310-cp310-macosx_10_9_x86_64.whl.metadata (5.9 kB)
Collecting cycler>=0.10 (from matplotlib->climsim-utils==0.0.1)
  Using cached cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib->climsim-utils==0.0.1)
  Downloading fonttools-4.43.1-cp310-cp310-macosx_10_9_x86_64.whl.metadata (152 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 152.4/152.4 kB 10.2 MB/s eta 0:00:00
Collecting kiwisolver>=1.0.1 (from matplotlib->climsim-utils==0.0.1)
  Downloading kiwisolver-1.4.5-cp310-cp310-macosx_10_9_x86_64.whl.metadata (6.4 kB)
Collecting packaging>=20.0 (from matplotlib->climsim-utils==0.0.1)
  Downloading packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pillow>=6.2.0 (from matplotlib->climsim-utils==0.0.1)
  Downloading Pillow-10.1.0-cp310-cp310-macosx_10_10_x86_64.whl.metadata (9.5 kB)
Collecting pyparsing>=2.3.1 (from matplotlib->climsim-utils==0.0.1)
  Using cached pyparsing-3.1.1-py3-none-any.whl.metadata (5.1 kB)
Collecting python-dateutil>=2.7 (from matplotlib->climsim-utils==0.0.1)
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 kB 19.1 MB/s eta 0:00:00
Collecting cftime (from netCDF4->climsim-utils==0.0.1)
  Downloading cftime-1.6.3-cp310-cp310-macosx_10_9_x86_64.whl.metadata (8.6 kB)
Collecting certifi (from netCDF4->climsim-utils==0.0.1)
  Downloading certifi-2023.7.22-py3-none-any.whl.metadata (2.2 kB)
Collecting pytz>=2020.1 (from pandas->climsim-utils==0.0.1)
  Downloading pytz-2023.3.post1-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.1 (from pandas->climsim-utils==0.0.1)
  Downloading tzdata-2023.3-py2.py3-none-any.whl (341 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 341.8/341.8 kB 27.7 MB/s eta 0:00:00
Collecting absl-py>=1.0.0 (from tensorflow->climsim-utils==0.0.1)
  Using cached absl_py-2.0.0-py3-none-any.whl.metadata (2.3 kB)
Collecting astunparse>=1.6.0 (from tensorflow->climsim-utils==0.0.1)
  Downloading astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Collecting flatbuffers>=23.5.26 (from tensorflow->climsim-utils==0.0.1)
  Using cached flatbuffers-23.5.26-py2.py3-none-any.whl.metadata (850 bytes)
Collecting gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 (from tensorflow->climsim-utils==0.0.1)
  Downloading gast-0.5.4-py3-none-any.whl (19 kB)
Collecting google-pasta>=0.1.1 (from tensorflow->climsim-utils==0.0.1)
  Downloading google_pasta-0.2.0-py3-none-any.whl (57 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.5/57.5 kB 9.0 MB/s eta 0:00:00
Collecting libclang>=13.0.0 (from tensorflow->climsim-utils==0.0.1)
  Using cached libclang-16.0.6-py2.py3-none-macosx_10_9_x86_64.whl.metadata (5.2 kB)
Collecting ml-dtypes==0.2.0 (from tensorflow->climsim-utils==0.0.1)
  Downloading ml_dtypes-0.2.0-cp310-cp310-macosx_10_9_universal2.whl.metadata (20 kB)
Collecting opt-einsum>=2.3.2 (from tensorflow->climsim-utils==0.0.1)
  Downloading opt_einsum-3.3.0-py3-none-any.whl (65 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.5/65.5 kB 8.1 MB/s eta 0:00:00
Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 (from tensorflow->climsim-utils==0.0.1)
  Using cached protobuf-4.24.4-cp37-abi3-macosx_10_9_universal2.whl.metadata (540 bytes)
Requirement already satisfied: setuptools in /Users/yoojin/opt/anaconda3/envs/climsim_env/lib/python3.10/site-packages (from tensorflow->climsim-utils==0.0.1) (68.0.0)
Collecting six>=1.12.0 (from tensorflow->climsim-utils==0.0.1)
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting termcolor>=1.1.0 (from tensorflow->climsim-utils==0.0.1)
  Downloading termcolor-2.3.0-py3-none-any.whl (6.9 kB)
Collecting typing-extensions>=3.6.6 (from tensorflow->climsim-utils==0.0.1)
  Downloading typing_extensions-4.8.0-py3-none-any.whl.metadata (3.0 kB)
Collecting wrapt<1.15,>=1.11.0 (from tensorflow->climsim-utils==0.0.1)
  Downloading wrapt-1.14.1-cp310-cp310-macosx_10_9_x86_64.whl (35 kB)
Collecting tensorflow-io-gcs-filesystem>=0.23.1 (from tensorflow->climsim-utils==0.0.1)
  Downloading tensorflow_io_gcs_filesystem-0.34.0-cp310-cp310-macosx_10_14_x86_64.whl.metadata (14 kB)
Collecting grpcio<2.0,>=1.24.3 (from tensorflow->climsim-utils==0.0.1)
  Downloading grpcio-1.59.2.tar.gz (24.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.8/24.8 MB 35.3 MB/s eta 0:00:0000:0100:01
  Preparing metadata (setup.py) ... done
Collecting tensorboard<2.15,>=2.14 (from tensorflow->climsim-utils==0.0.1)
  Using cached tensorboard-2.14.1-py3-none-any.whl.metadata (1.7 kB)
Collecting tensorflow-estimator<2.15,>=2.14.0 (from tensorflow->climsim-utils==0.0.1)
  Using cached tensorflow_estimator-2.14.0-py2.py3-none-any.whl.metadata (1.3 kB)
Collecting keras<2.15,>=2.14.0 (from tensorflow->climsim-utils==0.0.1)
  Using cached keras-2.14.0-py3-none-any.whl.metadata (2.4 kB)
Requirement already satisfied: wheel<1.0,>=0.23.0 in /Users/yoojin/opt/anaconda3/envs/climsim_env/lib/python3.10/site-packages (from astunparse>=1.6.0->tensorflow->climsim-utils==0.0.1) (0.41.2)
Collecting google-auth<3,>=1.6.3 (from tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading google_auth-2.23.4-py2.py3-none-any.whl.metadata (4.7 kB)
Collecting google-auth-oauthlib<1.1,>=0.5 (from tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Using cached google_auth_oauthlib-1.0.0-py2.py3-none-any.whl (18 kB)
Collecting markdown>=2.6.8 (from tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading Markdown-3.5.1-py3-none-any.whl.metadata (7.1 kB)
Collecting requests<3,>=2.21.0 (from tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting tensorboard-data-server<0.8.0,>=0.7.0 (from tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Using cached tensorboard_data_server-0.7.2-py3-none-macosx_10_9_x86_64.whl.metadata (1.1 kB)
Collecting werkzeug>=1.0.1 (from tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading werkzeug-3.0.1-py3-none-any.whl.metadata (4.1 kB)
Collecting cachetools<6.0,>=2.0.0 (from google-auth<3,>=1.6.3->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading cachetools-5.3.2-py3-none-any.whl.metadata (5.2 kB)
Collecting pyasn1-modules>=0.2.1 (from google-auth<3,>=1.6.3->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading pyasn1_modules-0.3.0-py2.py3-none-any.whl (181 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 181.3/181.3 kB 8.3 MB/s eta 0:00:00
Collecting rsa<5,>=3.1.4 (from google-auth<3,>=1.6.3->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading rsa-4.9-py3-none-any.whl (34 kB)
Collecting requests-oauthlib>=0.7.0 (from google-auth-oauthlib<1.1,>=0.5->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading requests_oauthlib-1.3.1-py2.py3-none-any.whl (23 kB)
Collecting charset-normalizer<4,>=2 (from requests<3,>=2.21.0->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading charset_normalizer-3.3.1-cp310-cp310-macosx_10_9_x86_64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests<3,>=2.21.0->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading idna-3.4-py3-none-any.whl (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.5/61.5 kB 4.4 MB/s eta 0:00:00
Collecting urllib3<3,>=1.21.1 (from requests<3,>=2.21.0->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading urllib3-2.0.7-py3-none-any.whl.metadata (6.6 kB)
Collecting MarkupSafe>=2.1.1 (from werkzeug>=1.0.1->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading MarkupSafe-2.1.3-cp310-cp310-macosx_10_9_x86_64.whl.metadata (3.0 kB)
Collecting pyasn1<0.6.0,>=0.4.6 (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading pyasn1-0.5.0-py2.py3-none-any.whl (83 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 83.9/83.9 kB 5.4 MB/s eta 0:00:00
Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.7.0->google-auth-oauthlib<1.1,>=0.5->tensorboard<2.15,>=2.14->tensorflow->climsim-utils==0.0.1)
  Downloading oauthlib-3.2.2-py3-none-any.whl (151 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 151.7/151.7 kB 15.4 MB/s eta 0:00:00
Downloading h5py-3.10.0-cp310-cp310-macosx_10_9_x86_64.whl (3.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 37.1 MB/s eta 0:00:00a 0:00:01
Downloading numpy-1.26.1-cp310-cp310-macosx_10_9_x86_64.whl (20.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.6/20.6 MB 28.4 MB/s eta 0:00:0000:0100:01
Downloading matplotlib-3.8.0-cp310-cp310-macosx_10_12_x86_64.whl (7.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.6/7.6 MB 35.8 MB/s eta 0:00:0000:0100:01
Downloading netCDF4-1.6.5-cp310-cp310-macosx_10_9_x86_64.whl (7.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.5/7.5 MB 27.8 MB/s eta 0:00:00a 0:00:01
Downloading pandas-2.1.2-cp310-cp310-macosx_10_9_x86_64.whl (11.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.7/11.7 MB 35.6 MB/s eta 0:00:0000:010:01
Downloading tensorflow-2.14.0-cp310-cp310-macosx_10_15_x86_64.whl (229.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.6/229.6 MB 23.3 MB/s eta 0:00:0000:0100:01
Downloading ml_dtypes-0.2.0-cp310-cp310-macosx_10_9_universal2.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 23.0 MB/s eta 0:00:0000:01
Using cached tqdm-4.66.1-py3-none-any.whl (78 kB)
Downloading xarray-2023.10.1-py3-none-any.whl (1.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 21.5 MB/s eta 0:00:0000:01
Using cached absl_py-2.0.0-py3-none-any.whl (130 kB)
Downloading contourpy-1.1.1-cp310-cp310-macosx_10_9_x86_64.whl (247 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.2/247.2 kB 13.5 MB/s eta 0:00:00
Using cached cycler-0.12.1-py3-none-any.whl (8.3 kB)
Using cached flatbuffers-23.5.26-py2.py3-none-any.whl (26 kB)
Downloading fonttools-4.43.1-cp310-cp310-macosx_10_9_x86_64.whl (2.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.2/2.2 MB 28.7 MB/s eta 0:00:0000:0100:01
Using cached keras-2.14.0-py3-none-any.whl (1.7 MB)
Downloading kiwisolver-1.4.5-cp310-cp310-macosx_10_9_x86_64.whl (68 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.1/68.1 kB 4.3 MB/s eta 0:00:00
Using cached libclang-16.0.6-py2.py3-none-macosx_10_9_x86_64.whl (24.5 MB)
Downloading packaging-23.2-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.0/53.0 kB 3.0 MB/s eta 0:00:00
Downloading Pillow-10.1.0-cp310-cp310-macosx_10_10_x86_64.whl (3.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.5/3.5 MB 36.5 MB/s eta 0:00:00a 0:00:01
Using cached protobuf-4.24.4-cp37-abi3-macosx_10_9_universal2.whl (409 kB)
Using cached pyparsing-3.1.1-py3-none-any.whl (103 kB)
Downloading pytz-2023.3.post1-py2.py3-none-any.whl (502 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 502.5/502.5 kB 21.4 MB/s eta 0:00:00
Using cached tensorboard-2.14.1-py3-none-any.whl (5.5 MB)
Using cached tensorflow_estimator-2.14.0-py2.py3-none-any.whl (440 kB)
Downloading tensorflow_io_gcs_filesystem-0.34.0-cp310-cp310-macosx_10_14_x86_64.whl (1.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 28.7 MB/s eta 0:00:0000:01
Downloading typing_extensions-4.8.0-py3-none-any.whl (31 kB)
Downloading certifi-2023.7.22-py3-none-any.whl (158 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 158.3/158.3 kB 11.4 MB/s eta 0:00:00
Downloading cftime-1.6.3-cp310-cp310-macosx_10_9_x86_64.whl (253 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 253.6/253.6 kB 13.3 MB/s eta 0:00:00
Downloading google_auth-2.23.4-py2.py3-none-any.whl (183 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 183.3/183.3 kB 12.9 MB/s eta 0:00:00
Downloading Markdown-3.5.1-py3-none-any.whl (102 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 102.2/102.2 kB 8.5 MB/s eta 0:00:00
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 4.8 MB/s eta 0:00:00
Using cached tensorboard_data_server-0.7.2-py3-none-macosx_10_9_x86_64.whl (4.8 MB)
Downloading werkzeug-3.0.1-py3-none-any.whl (226 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 226.7/226.7 kB 12.1 MB/s eta 0:00:00
Downloading cachetools-5.3.2-py3-none-any.whl (9.3 kB)
Downloading charset_normalizer-3.3.1-cp310-cp310-macosx_10_9_x86_64.whl (120 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.2/120.2 kB 8.2 MB/s eta 0:00:00
Downloading MarkupSafe-2.1.3-cp310-cp310-macosx_10_9_x86_64.whl (13 kB)
Downloading urllib3-2.0.7-py3-none-any.whl (124 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 kB 7.4 MB/s eta 0:00:00
Building wheels for collected packages: climsim-utils, grpcio
  Building wheel for climsim-utils (setup.py) ... done
  Created wheel for climsim-utils: filename=climsim_utils-0.0.1-py3-none-any.whl size=14489 sha256=59f74a9ac4adc3bfd39fbda431e796af85e7c62c55c64f3d60f0596ae0269d09
  Stored in directory: /private/var/folders/pt/097n80157x964hszt0ngx3sc0000gn/T/pip-ephem-wheel-cache-8s9p5iyi/wheels/c8/c9/eb/87a8efe52c0385d4c8a0a32192d8e97d6491a98dd65f557c1b
  Building wheel for grpcio (setup.py) ... done
  Created wheel for grpcio: filename=grpcio-1.59.2-cp310-cp310-macosx_10_10_x86_64.whl size=4292205 sha256=ce4afc65b66392b5a02b9d8fb5c0b7517e45b806eeb899c323167309d84910ed
  Stored in directory: /Users/yoojin/Library/Caches/pip/wheels/cc/c1/40/d7f5b0c69ed82ba6c5e01048476631d80637640cc15c362565
Successfully built climsim-utils grpcio
Installing collected packages: pytz, libclang, flatbuffers, wrapt, urllib3, tzdata, typing-extensions, tqdm, termcolor, tensorflow-io-gcs-filesystem, tensorflow-estimator, tensorboard-data-server, six, pyparsing, pyasn1, protobuf, pillow, packaging, oauthlib, numpy, MarkupSafe, markdown, kiwisolver, keras, idna, grpcio, gast, fonttools, cycler, charset-normalizer, certifi, cachetools, absl-py, werkzeug, rsa, requests, python-dateutil, pyasn1-modules, opt-einsum, ml-dtypes, h5py, google-pasta, contourpy, cftime, astunparse, requests-oauthlib, pandas, netCDF4, matplotlib, google-auth, xarray, google-auth-oauthlib, tensorboard, tensorflow, climsim-utils
Successfully installed MarkupSafe-2.1.3 absl-py-2.0.0 astunparse-1.6.3 cachetools-5.3.2 certifi-2023.7.22 cftime-1.6.3 charset-normalizer-3.3.1 climsim-utils-0.0.1 contourpy-1.1.1 cycler-0.12.1 flatbuffers-23.5.26 fonttools-4.43.1 gast-0.5.4 google-auth-2.23.4 google-auth-oauthlib-1.0.0 google-pasta-0.2.0 grpcio-1.59.2 h5py-3.10.0 idna-3.4 keras-2.14.0 kiwisolver-1.4.5 libclang-16.0.6 markdown-3.5.1 matplotlib-3.8.0 ml-dtypes-0.2.0 netCDF4-1.6.5 numpy-1.26.1 oauthlib-3.2.2 opt-einsum-3.3.0 packaging-23.2 pandas-2.1.2 pillow-10.1.0 protobuf-4.24.4 pyasn1-0.5.0 pyasn1-modules-0.3.0 pyparsing-3.1.1 python-dateutil-2.8.2 pytz-2023.3.post1 requests-2.31.0 requests-oauthlib-1.3.1 rsa-4.9 six-1.16.0 tensorboard-2.14.1 tensorboard-data-server-0.7.2 tensorflow-2.14.0 tensorflow-estimator-2.14.0 tensorflow-io-gcs-filesystem-0.34.0 termcolor-2.3.0 tqdm-4.66.1 typing-extensions-4.8.0 tzdata-2023.3 urllib3-2.0.7 werkzeug-3.0.1 wrapt-1.14.1 xarray-2023.10.1
Note: you may need to restart the kernel to use updated packages.

Step 3. Importing necessary libraries/modules¶

In [1]:
import gc
import os
import numpy as np
import psutil
import torch
import torch.nn as nn
import torch.optim as optim
from torch.cuda.amp import autocast, GradScaler
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm
from torch.utils.checkpoint import checkpoint
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from tensorflow import keras
import string
import matplotlib.pyplot as plt
import pandas as pd
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
2023-11-01 09:41:53.825082: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

If you successfully completed Step 2, you should be able to import the ClimSim Python package as well.

In [2]:
from climsim_utils.data_utils import *

Step 4. Instantiate class¶

In [ ]:
''' 
TODO: Check the current working directory and make sure you are in demo_notebooks directory
'''

Check the Current Working Directory from above, and make sure you are in demo_notebooks directory.

In [3]:
# Check the current working directory 
current_directory = os.getcwd()
print("Current Working directory :", current_directory)
Current Working directory : /Users/yoojin/Documents/GitHub/ClimSim/demo_notebooks

In Step 4, you need to instantiate the data_utils class: initialize data as an instance of data_utils.

This data object plays a role in processing, managing, and transforming climate simulation data, among other tasks.

In [3]:
grid_path = '../grid_info/ClimSim_low-res_grid-info.nc'
norm_path = '../preprocessing/normalizations/'

grid_info = xr.open_dataset(grid_path)
input_mean = xr.open_dataset(norm_path + 'inputs/input_mean.nc')
input_max = xr.open_dataset(norm_path + 'inputs/input_max.nc')
input_min = xr.open_dataset(norm_path + 'inputs/input_min.nc')
output_scale = xr.open_dataset(norm_path + 'outputs/output_scale.nc')

data = data_utils(grid_info = grid_info, 
                  input_mean = input_mean, 
                  input_max = input_max, 
                  input_min = input_min, 
                  output_scale = output_scale)

# set variables to V1 subset
data.set_to_v1_vars()
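The mean/max/min files loaded above are used by data_utils to normalize the model inputs. As intuition, here is a minimal NumPy sketch of that style of normalization (this is our reading of the formula, (x - mean) / (max - min); check the data_utils source for the exact version):

```python
import numpy as np

def normalize(x, mean, xmax, xmin):
    # Center each feature on its mean, then scale by its range.
    return (x - mean) / (xmax - xmin)

x = np.array([2.0, 4.0])
print(normalize(x, mean=3.0, xmax=5.0, xmin=1.0))  # [-0.25  0.25]
```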

Step 5. Download the DataSet¶

All ClimSim datasets are available on Hugging Face. (https://huggingface.co/LEAP)

ClimSim datasets are divided into three main categories:

    1. LEAP/ClimSim_high-res : High-Resolution Real Geography
    2. LEAP/ClimSim_low-res : Low-Resolution Real Geography
    3. LEAP/ClimSim_low-res_aqua-planet : Low-Resolution Aquaplanet

plus a fourth:

    4. LEAP/subsampled_low_res : the subsampled Low-Resolution Real Geography dataset, which is the dataset used for the baseline models.

The data to download in Step 5 is the fourth one, LEAP/subsampled_low_res: the subsampled low-resolution data, which is subsampled from 2. LEAP/ClimSim_low-res.

From 4. LEAP/subsampled_low_res (https://huggingface.co/datasets/LEAP/subsampled_low_res/tree/main),

you need to download the following four files to your local machine (they do not have to be inside the ClimSim repo):

  1. train_input.npy
  2. train_target.npy
  3. val_input.npy
  4. val_target.npy

After downloading the four files, make sure you set downloaded_data_path correctly so the data can be loaded in Step 6.

In [4]:
# Set the path to my Downloads directory
''' 
TODO: Change the file path according to your local environment. 

Skip this code cell and go to the next (below) code cell if you plan to load the data from Google Drive.
'''
downloaded_data_path = '/Users/yoojin/Downloads/'
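Before moving on, it can help to confirm the four files are actually present under downloaded_data_path (missing_files is our own helper name, not part of ClimSim):

```python
import os

EXPECTED_FILES = ["train_input.npy", "train_target.npy", "val_input.npy", "val_target.npy"]

def missing_files(base_path, names=EXPECTED_FILES):
    # Return whichever of the expected files are not found under base_path.
    return [n for n in names if not os.path.exists(os.path.join(base_path, n))]

# e.g.: missing = missing_files(downloaded_data_path); print(missing or "all files found")
```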

OPTION) Using Google Drive to Store and Load the Data¶

Instead of setting downloaded_data_path on your local machine and downloading the files there, we mounted Google Drive for an easier process. To use Google Drive for loading the dataset, you should load the whole repository into the /content/drive/MyDrive/ path.

However, note that this step only works when running in Google Colab; you cannot perform it from a local Jupyter notebook.

You can easily do this by directly downloading the ClimSim GitHub repo (the ClimSim-main.zip file) as in Step 1, option 3.

Then, save the four data files into Google Drive under the /content/drive/MyDrive/ path.

Now, let's mount Google Drive !

In [ ]:
# Only available in Google Colab
from google.colab import drive
drive.mount('/content/drive')

Then, all you need to do is change downloaded_data_path to: /content/drive/MyDrive/

In [ ]:
downloaded_data_path = '/content/drive/MyDrive/'

Then you are all set to load the dataset using Google Drive. Everything else stays the same.

Step 6. Load the Data¶

After downloading the four files to downloaded_data_path, load the data into this Jupyter notebook.

In [5]:
# Set the path for all four files
train_input_path = downloaded_data_path + 'train_input.npy'
train_target_path = downloaded_data_path + 'train_target.npy'
val_input_path = downloaded_data_path + 'val_input.npy'
val_target_path = downloaded_data_path + 'val_target.npy'

# Load four files
data.input_train = data.load_npy_file(train_input_path)
data.target_train = data.load_npy_file(train_target_path)
data.input_val = data.load_npy_file(val_input_path)
data.target_val = data.load_npy_file(val_target_path)
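After loading, a quick shape sanity check catches mixed-up files early. Here is a sketch with dummy arrays standing in for the real .npy files (the 124 input / 128 target feature counts match the V1 variable subset as we understand it; verify against your own data):

```python
import numpy as np

def check_split(inputs, targets, name):
    # Inputs and targets must describe the same samples (same row count).
    assert inputs.shape[0] == targets.shape[0], f"{name}: sample-count mismatch"
    print(f"{name}: {inputs.shape[0]} samples, "
          f"{inputs.shape[1]} input features, {targets.shape[1]} target features")

# Dummy stand-ins; with the real data, call check_split(data.input_train, data.target_train, "train")
check_split(np.zeros((10, 124)), np.zeros((10, 128)), "train")
```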

Step 7. Before Loading the Model:¶

Understand the baseline models and their folders in the ClimSim repo¶

There are six different baseline models that were created and trained by the ClimSim team:

1) Convolutional neural network (CNN)

2) Encoder-decoder (ED)

3) Heteroskedastic regression (HSR)

4) Multi-layer perceptron (MLP)

5) Randomized prior network (RPN)

6) Conditional variational autoencoder (cVAE)

In ClimSim/baseline_models, there is one subfolder for each of the six models.

In [144]:
display(img3)

In each model's folder, there are three subfolders:

1) env, 2) model, and 3) training.

1. In env sub-folder,

there is a tf2.yml file that is used to define a Conda environment.

This YML file defines a Conda environment with a set of dependencies, including specific versions of Python packages and system libraries. So, you can check the environment dependencies with this file.

2. In model sub-folder,

there is the corresponding trained model file, in .pb, .h5, or checkpoint format. You can load these files to use the baseline models.

Here's one example, loading the ED model:

e.g. Load the ED Model¶
In [7]:
# Set the model file's path that you are going to use among baseline models

# Check the current working directory 
current_directory = os.getcwd()
print("Current Working directory :", current_directory)

''' 
TODO: Change the file path according to your local environment
'''
model_path = '../baseline_models/ED/model/ED_ClimSIM_1_3_model.h5'
Current Working directory : /Users/yoojin/Documents/GitHub/ClimSim/demo_notebooks

You can load the model (here named model) and pass input data to it for prediction or other operations.

In [8]:
model = keras.models.load_model(model_path)
2023-11-01 03:51:59.390712: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
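Once loaded, the model can generate predictions. A hedged sketch (the helper name is ours; input_data stands for a NumPy array of validation inputs such as data.input_val):

```python
def predict_with_model(model, input_data):
    # Run the loaded Keras model on an array shaped (n_samples, n_features).
    return model.predict(input_data)

# e.g.: preds = predict_with_model(model, data.input_val)
```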

3. In training sub-folder,

you will find Python files (.py) for model creation, storage, and training, as well as shell script files (.sh)

Issue 1 that might occur in Step 7 - Load the Model¶

There is an issue with loading the CNN model.

Loading CNN model¶
In [9]:
# Check the current working directory 
current_directory = os.getcwd()
print("Current Working directory :", current_directory)
Current Working directory : /Users/yoojin/Documents/GitHub/ClimSim/demo_notebooks
In [10]:
''' 
TODO: Change the file path according to your local environment
'''
# to upload CNN model
cnn_model_path = '../baseline_models/CNN/model'

First trial: using tf.keras.models.load_model()

ValueError: Unable to create a Keras model from SavedModel at /Users/yoojin/Documents/GitHub/ClimSim/baseline_models/CNN/model. This SavedModel was exported with tf.saved_model.save, and lacks the Keras metadata file. Please save your Keras model by calling model.save or tf.keras.models.save_model. Note that you can still load this SavedModel with tf.saved_model.load.

suggestion)

This warning appears when loading a SavedModel that was exported before TensorFlow 2.5, or exported with tf.saved_model.save, so it lacks the Keras metadata file that tf.keras.models.load_model expects in TensorFlow 2.5 and above.

To resolve this, the model should be saved through the Keras API, e.g. with model.save() or tf.keras.models.save_model().

However, since we couldn't find any model-saving code within the training folder, it may be best to explicitly add it to the training script.
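As a minimal sketch of this suggestion (using a hypothetical stand-in model, not the actual CNN baseline architecture):

```python
import tensorflow as tf

# Hypothetical stand-in for the CNN baseline; the real architecture
# is built in the training scripts.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])

# Saving through the Keras API (here as HDF5, matching the ED
# baseline's ED_ClimSIM_1_3_model.h5 format) keeps the Keras
# metadata, so tf.keras.models.load_model() can restore the model.
model.save("my_cnn_model.h5")
reloaded = tf.keras.models.load_model("my_cnn_model.h5")
```

Saving in a Keras-native format like this avoids the missing-metadata ValueError entirely.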

In [11]:
loaded_model_2 = tf.keras.models.load_model(cnn_model_path)
WARNING:tensorflow:SavedModel saved prior to TF 2.5 detected when loading Keras model. Please ensure that you are saving the model with model.save() or tf.keras.models.save_model(), *NOT* tf.saved_model.save(). To confirm, there should be a file named "keras_metadata.pb" in the SavedModel directory.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 loaded_model_2 = tf.keras.models.load_model(cnn_model_path)

File ~/opt/anaconda3/envs/env3/lib/python3.9/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~/opt/anaconda3/envs/env3/lib/python3.9/site-packages/keras/saving/saved_model/load.py:220, in _read_legacy_metadata(object_graph_def, metadata, path)
    214 if (
    215     proto.WhichOneof("kind") == "user_object"
    216     and proto.user_object.identifier
    217     in constants.KERAS_OBJECT_IDENTIFIERS
    218 ):
    219     if not proto.user_object.metadata:
--> 220         raise ValueError(
    221             "Unable to create a Keras model from SavedModel at "
    222             f"{path}. This SavedModel was exported with "
    223             "`tf.saved_model.save`, and lacks the Keras metadata file. "
    224             "Please save your Keras model by calling `model.save` "
    225             "or `tf.keras.models.save_model`. Note that "
    226             "you can still load this SavedModel with "
    227             "`tf.saved_model.load`."
    228         )
    229     metadata.nodes.add(
    230         node_id=node_id,
    231         node_path=node_paths[node_id],
   (...)
    236         metadata=proto.user_object.metadata,
    237     )

ValueError: Unable to create a Keras model from SavedModel at ../baseline_models/CNN/model. This SavedModel was exported with `tf.saved_model.save`, and lacks the Keras metadata file. Please save your Keras model by calling `model.save` or `tf.keras.models.save_model`. Note that you can still load this SavedModel with `tf.saved_model.load`.

Second trial) when using tf.saved_model.load()

TypeError: __init__() takes from 2 to 4 positional arguments but 5 were given

This error likely stems from a mismatch between the TensorFlow version that exported the SavedModel and the version used to load it.

In [34]:
loaded_model_ = tf.saved_model.load(cnn_model_path)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[34], line 1
----> 1 loaded_model_ = tf.saved_model.load(cnn_model_path)

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/load.py:800, in load(export_dir, tags, options)
    798 if isinstance(export_dir, os.PathLike):
    799   export_dir = os.fspath(export_dir)
--> 800 result = load_partial(export_dir, None, tags, options)["root"]
    801 return result

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/load.py:930, in load_partial(export_dir, filters, tags, options)
    928 with ops.init_scope():
    929   try:
--> 930     loader = Loader(object_graph_proto, saved_model_proto, export_dir,
    931                     ckpt_options, options, filters)
    932   except errors.NotFoundError as err:
    933     raise FileNotFoundError(
    934         str(err) + "\n You may be trying to load on a different device "
    935         "from the computational device. Consider setting the "
    936         "`experimental_io_device` option in `tf.saved_model.LoadOptions` "
    937         "to the io_device such as '/job:localhost'.")

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/load.py:154, in Loader.__init__(self, object_graph_proto, saved_model_proto, export_dir, ckpt_options, save_options, filters)
    151 self._proto = object_graph_proto
    152 self._export_dir = export_dir
    153 self._concrete_functions = (
--> 154     function_deserialization.load_function_def_library(
    155         library=meta_graph.graph_def.library,
    156         saved_object_graph=self._proto,
    157         wrapper_function=_WrapperFunction))
    158 # Store a set of all concrete functions that have been set up with
    159 # captures.
    160 self._restored_concrete_functions = set()

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/function_deserialization.py:405, in load_function_def_library(library, saved_object_graph, load_shared_name_suffix, wrapper_function)
    398 if (saved_object_graph is not None and
    399     orig_name in saved_object_graph.concrete_functions):
    400   # TODO(b/204324043): Offload the deserialization of the protos to the
    401   # first class objects by passing the actual protos. This is blocked on
    402   # importing `nested_structure_coder` in function.py causing a circular
    403   # dependency.
    404   proto = saved_object_graph.concrete_functions[orig_name]
--> 405   structured_input_signature = nested_structure_coder.decode_proto(
    406       proto.canonicalized_input_signature)
    407   structured_outputs = nested_structure_coder.decode_proto(
    408       proto.output_signature)
    410 # There is no need to copy all functions into the function def graph. It
    411 # leads to a O(n^2) increase of memory when importing functions and the
    412 # extra function definitions are a no-op since they already imported as a
    413 # function before and passed in explicitly (due to the topologic sort
    414 # import).

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:135, in decode_proto(proto)
    122 @tf_export("__internal__.saved_model.decode_proto", v1=[])
    123 def decode_proto(proto):
    124   """Decodes proto representing a nested structure.
    125 
    126   Args:
   (...)
    133     NotEncodableError: For values for which there are no encoders.
    134   """
--> 135   return _map_structure(proto, _get_decoders())

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:85, in _map_structure(pyobj, coders)
     83   if can(pyobj):
     84     recursion_fn = functools.partial(_map_structure, coders=coders)
---> 85     return do(pyobj, recursion_fn)
     86 raise NotEncodableError(
     87     f"No encoder for object {str(pyobj)} of type {type(pyobj)}.")

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:195, in _TupleCodec.do_decode(self, value, decode_fn)
    194 def do_decode(self, value, decode_fn):
--> 195   return tuple(decode_fn(element) for element in value.tuple_value.values)

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:195, in <genexpr>(.0)
    194 def do_decode(self, value, decode_fn):
--> 195   return tuple(decode_fn(element) for element in value.tuple_value.values)

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:85, in _map_structure(pyobj, coders)
     83   if can(pyobj):
     84     recursion_fn = functools.partial(_map_structure, coders=coders)
---> 85     return do(pyobj, recursion_fn)
     86 raise NotEncodableError(
     87     f"No encoder for object {str(pyobj)} of type {type(pyobj)}.")

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:195, in _TupleCodec.do_decode(self, value, decode_fn)
    194 def do_decode(self, value, decode_fn):
--> 195   return tuple(decode_fn(element) for element in value.tuple_value.values)

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:195, in <genexpr>(.0)
    194 def do_decode(self, value, decode_fn):
--> 195   return tuple(decode_fn(element) for element in value.tuple_value.values)

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:85, in _map_structure(pyobj, coders)
     83   if can(pyobj):
     84     recursion_fn = functools.partial(_map_structure, coders=coders)
---> 85     return do(pyobj, recursion_fn)
     86 raise NotEncodableError(
     87     f"No encoder for object {str(pyobj)} of type {type(pyobj)}.")

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/saved_model/nested_structure_coder.py:572, in _TypeSpecCodec.do_decode(self, value, decode_fn)
    569   type_spec_class = self.TYPE_SPEC_CLASS_FROM_PROTO[type_spec_class_enum]
    571 # pylint: disable=protected-access
--> 572 return type_spec_class._deserialize(decode_fn(type_spec_proto.type_state))

File ~/opt/anaconda3/envs/climsim_env_/lib/python3.9/site-packages/tensorflow/python/framework/type_spec.py:469, in TypeSpec._deserialize(cls, serialization)
    449 @classmethod
    450 def _deserialize(cls, serialization):
    451   """Reconstructs a TypeSpec from a value returned by `serialize`.
    452 
    453   Args:
   (...)
    467     A `TypeSpec` of type `cls`.
    468   """
--> 469   return cls(*serialization)

TypeError: __init__() takes from 2 to 4 positional arguments but 5 were given

*Issue 2 that might occur in Step 7 - Running Python training scripts on an M1 chip*¶

Running the training script in the training folder raises an error on MacBook M1 chips¶

For the CNN model, run hpo_train.py in ./ClimSim/baseline_models/CNN/training with the following command:

In [ ]:
python hpo_train.py
In [145]:
display(img4)

On MacBook Pro M1/M2 machines, native TensorFlow support is still a work in progress, which is why this error occurs.

You can run the following command to run Python for the x86 architecture using Rosetta 2:

In [ ]:
# Run hpo_train.py in "training" directory
arch -x86_64 python hpo_train.py

Rosetta 2 is a translation layer for running x86 applications on Apple Silicon Macs. You can use Rosetta 2 on the command line to run Python and packages built for the x86 architecture. (For more possible solutions, see: https://stackoverflow.com/questions/65383338/zsh-illegal-hardware-instruction-python-when-installing-tensorflow-on-macbook )
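To confirm which architecture your Python interpreter is actually running under (native Apple Silicon vs. x86_64 under Rosetta 2), you can check from within Python:

```python
import platform

# 'arm64' means a native Apple Silicon interpreter; 'x86_64' means
# it is running under Rosetta 2 (or on an Intel machine).
print(platform.machine())
```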

*Alternative solution to address these issues: building a simpler model*¶

To address these issues, our Group 5 has created a simpler CNN model that can be trained and used directly in this notebook, without relying on those folders and files.

SimpleCNN¶

In [28]:
# Define a model, here we use a simple 1D convolutional network
class SimpleCNN(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv1d(input_dim, 128, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(128, output_dim)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = torch.mean(x, dim=2)  # Global Average Pooling
        x = self.fc(x)
        return x

# Prepare the data.
# Assume the shape of data.input_train is (10091520, 124); we reshape it to (10091520, 124, 1).
# Assume the shape of data.target_train is (10091520, <output_dim>).
input_train = np.expand_dims(data.input_train, axis=2)
target_train = data.target_train

# Convert the data to PyTorch tensors.
input_train = torch.tensor(input_train, dtype=torch.float32)
target_train = torch.tensor(target_train, dtype=torch.float32)

# Define the batch size and gradient accumulation steps.
batch_size = 64
epochs = 3
input_dim = input_train.shape[1]
output_dim = target_train.shape[1]

# Initialize the model, loss function, and optimizer.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleCNN(input_dim, output_dim).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Create a data loader.
train_dataset = torch.utils.data.TensorDataset(input_train, target_train)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

# Training loop
for epoch in range(epochs):
    model.train()
    running_loss = 0.0
    for inputs, targets in tqdm(train_loader):
        inputs, targets = inputs.to(device), targets.to(device)

        optimizer.zero_grad()
        with autocast():
            outputs = model(inputs)
            loss = criterion(outputs, targets)

        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f"Epoch {epoch+1}/{epochs}, Loss: {running_loss/len(train_loader):.4f}")

# Save the entire model after training is finished.
torch.save(model, "cnn_model.pth")
100%|██████████████████████████████████| 157680/157680 [09:28<00:00, 277.13it/s]
Epoch 1/3, Loss: 0.0045
100%|██████████████████████████████████| 157680/157680 [09:34<00:00, 274.31it/s]
Epoch 2/3, Loss: 0.0044
100%|██████████████████████████████████| 157680/157680 [09:58<00:00, 263.35it/s]
Epoch 3/3, Loss: 0.0043
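One caveat with torch.save(model, ...) above: it pickles the entire object, so the SimpleCNN class definition must be available in the session that loads the file. A more portable convention (shown here as a sketch, reusing the SimpleCNN architecture defined above) is to save only the state_dict:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Same architecture as the SimpleCNN defined above."""
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.conv1 = nn.Conv1d(input_dim, 128, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(128, output_dim)

    def forward(self, x):
        x = torch.mean(self.relu(self.conv1(x)), dim=2)  # global average pooling
        return self.fc(x)

model = SimpleCNN(124, 128)

# Save only the parameters, not the pickled class.
torch.save(model.state_dict(), "cnn_model_state.pth")

# Loading requires re-instantiating the architecture first.
restored = SimpleCNN(124, 128)
restored.load_state_dict(torch.load("cnn_model_state.pth"))
restored.eval()
```

With a state_dict, the checkpoint stays loadable even if the surrounding module paths change, as long as you can rebuild the same architecture.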

ImprovedCNN¶

In [18]:
# Define model
class ImprovedCNN(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(ImprovedCNN, self).__init__()
        self.conv1 = nn.Conv1d(input_dim, 64, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm1d(64)
        self.relu = nn.ReLU()
        self.maxpool1 = nn.MaxPool1d(kernel_size=2, stride=2, padding=1)  # added padding
        self.dropout = nn.Dropout(0.5)

        self.conv2 = nn.Conv1d(64, 128, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm1d(128)
        self.maxpool2 = nn.MaxPool1d(kernel_size=2, stride=2, padding=1)  # added padding

        self.conv3 = nn.Conv1d(128, 256, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm1d(256)
        self.adaptivepool = nn.AdaptiveAvgPool1d(1)

        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(256, 1024)
        self.dropout1 = nn.Dropout(0.5)
        self.relu1 = nn.ReLU()

        self.fc2 = nn.Linear(1024, 512)
        self.dropout2 = nn.Dropout(0.5)
        self.relu2 = nn.ReLU()

        self.output = nn.Linear(512, output_dim)

    def forward(self, x):
        x = self.maxpool1(self.relu(self.bn1(self.conv1(x))))
        x = self.maxpool2(self.relu(self.bn2(self.conv2(x))))
        x = self.adaptivepool(self.relu(self.bn3(self.conv3(x))))
        x = self.flatten(x)
        x = self.relu1(self.dropout1(self.fc1(x)))
        x = self.relu2(self.dropout2(self.fc2(x)))
        x = self.output(x)
        return x

# Prepare the data
input_train = np.expand_dims(data.input_train, axis=2)
target_train = data.target_train
input_val = np.expand_dims(data.input_val, axis=2)
target_val = data.target_val

# Convert the data to PyTorch tensors.
input_train = torch.tensor(input_train, dtype=torch.float32)
target_train = torch.tensor(target_train, dtype=torch.float32)
input_val = torch.tensor(input_val, dtype=torch.float32)
target_val = torch.tensor(target_val, dtype=torch.float32)

# Define the batch size and number of epochs.
batch_size = 64
epochs = 3

# Initialize the model, loss function, and optimizer.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ImprovedCNN(input_train.shape[1], target_train.shape[1]).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Create a data loader.
train_loader = DataLoader(TensorDataset(input_train, target_train), batch_size=batch_size, shuffle=True)
val_loader = DataLoader(TensorDataset(input_val, target_val), batch_size=batch_size, shuffle=False)

# Early stopping parameters.
patience = 5
best_val_loss = float('inf')
epochs_without_improvement = 0

# Training loop
for epoch in range(epochs):
    model.train()
    running_loss = 0.0
    for inputs, targets in tqdm(train_loader):
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    print(f"Epoch {epoch+1}/{epochs}, Training Loss: {running_loss/len(train_loader):.4f}")

    # Validation phase.
    model.eval()
    running_val_loss = 0.0
    with torch.no_grad():
        for val_inputs, val_targets in val_loader:
            val_inputs, val_targets = val_inputs.to(device), val_targets.to(device)
            val_outputs = model(val_inputs)
            val_loss = criterion(val_outputs, val_targets)
            running_val_loss += val_loss.item()

    avg_val_loss = running_val_loss / len(val_loader)
    print(f"Epoch {epoch+1}/{epochs}, Validation Loss: {avg_val_loss:.4f}")

    # Early stopping check.
    if avg_val_loss < best_val_loss:
        best_val_loss = avg_val_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"Early stopping after {epochs_without_improvement} epochs without improvement.")
            break

# Save the entire model after training is finished.
torch.save(model, "cnn_model.pth")
  0%|                                                | 0/157680 [00:00<?, ?it/s]Fatal Python error: config_get_locale_encoding: failed to get the locale encoding: nl_langinfo(CODESET) failed
Python runtime state: preinitialized

100%|███████████████████████████████████| 157680/157680 [38:50<00:00, 67.65it/s]
Epoch 1/3, Training Loss: 0.0051
Epoch 1/3, Validation Loss: 0.0045
100%|███████████████████████████████████| 157680/157680 [35:45<00:00, 73.48it/s]
Epoch 2/3, Training Loss: 0.0048
Epoch 2/3, Validation Loss: 0.0045
100%|███████████████████████████████████| 157680/157680 [35:25<00:00, 74.18it/s]
Epoch 3/3, Training Loss: 0.0048
Epoch 3/3, Validation Loss: 0.0045

*One suggestion for the training process: make sure to include code for saving the model*¶

Make sure to include model-saving code in the Python files in the training folder.

e.g.) For the CNN model, add code that saves the model file (e.g., my_model.h5) to hpo_train.py in the /ClimSim/baseline_models/CNN/training folder.

In [146]:
display(img5)

Step 8. Load the Model¶

Basic models:

  1. constant prediction model,
  2. multiple linear regression model

1. Train constant prediction model¶

$ \hat{y} = E[y_{train}] $

Constant prediction model: a model that always predicts the same constant value (the mean of the training targets) regardless of the input.

In [20]:
const_model = data.target_train.mean(axis = 0)

2. Train multiple linear regression model¶

$\beta = {(X_{train}^TX_{train})}^{-1}X_{train}^Ty_{train}$

$\hat{y} = X_{input}^T \beta$

where $X_{train}$ and $X_{input}$ correspond to the training data and the input data you would like to inference on, respectively. $X_{train}$ and $X_{input}$ both have a column of ones concatenated to the feature space for the bias.

adding bias unit¶
In [21]:
X = data.input_train
bias_vector = np.ones((X.shape[0], 1))
X = np.concatenate((X, bias_vector), axis=1)
create model¶
In [22]:
mlr_weights = np.linalg.inv(X.transpose()@X)@X.transpose()@data.target_train
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
(suggestion)¶

You can revise the calculation as follows:

Instead of computing the inverse directly, use np.linalg.solve, which is more numerically stable and efficient for solving systems of linear equations.

In [14]:
XTX = X.transpose() @ X
XTy = X.transpose() @ data.target_train
mlr_weights = np.linalg.solve(XTX, XTy)
In [15]:
available_memory = psutil.virtual_memory().available
total_memory = psutil.virtual_memory().total
torch.cuda.empty_cache()

print(f"Available RAM: {available_memory / (1024**3):.2f} GB")
print(f"Total RAM: {total_memory / (1024**3):.2f} GB")
print(torch.cuda.is_available())
Available RAM: 3.99 GB
Total RAM: 32.00 GB
False

3. Load the pre-trained model¶

You can load the model you want in this part: you can load one of the six baseline models by pointing to its model file.

Or

You can make your own model and then load it in this step. Note that because torch.save(model, ...) pickles the entire object, the model class (e.g., ImprovedCNN) must be defined in the current session before torch.load can restore it.

In [12]:
# Check the current working directory 
current_directory = os.getcwd()

print("Current Working directory :", current_directory)
Current Working directory : /Users/yoojin/Documents/GitHub/ClimSim/demo_notebooks
In [19]:
### 
# TODO: Change the path of pre-trained model here
###
pretrained_model_path = "cnn_model.pth"

# Load the pre-trained model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load(pretrained_model_path, map_location=device)
model.to(device)
model.eval()
Out[19]:
ImprovedCNN(
  (conv1): Conv1d(124, 64, kernel_size=(3,), stride=(1,), padding=(1,))
  (bn1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU()
  (maxpool1): MaxPool1d(kernel_size=2, stride=2, padding=1, dilation=1, ceil_mode=False)
  (dropout): Dropout(p=0.5, inplace=False)
  (conv2): Conv1d(64, 128, kernel_size=(3,), stride=(1,), padding=(1,))
  (bn2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (maxpool2): MaxPool1d(kernel_size=2, stride=2, padding=1, dilation=1, ceil_mode=False)
  (conv3): Conv1d(128, 256, kernel_size=(3,), stride=(1,), padding=(1,))
  (bn3): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (adaptivepool): AdaptiveAvgPool1d(output_size=1)
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (fc1): Linear(in_features=256, out_features=1024, bias=True)
  (dropout1): Dropout(p=0.5, inplace=False)
  (relu1): ReLU()
  (fc2): Linear(in_features=1024, out_features=512, bias=True)
  (dropout2): Dropout(p=0.5, inplace=False)
  (relu2): ReLU()
  (output): Linear(in_features=512, out_features=128, bias=True)
)

Step 9. Evaluate on validation data¶

Set pressure grid¶

Configure the pressure grid to match the validation data.

In [23]:
data.set_pressure_grid(data_split = 'val')

Load predictions¶

1. Constant prediction model¶

In [24]:
const_pred_val = np.repeat(const_model[np.newaxis, :], data.target_val.shape[0], axis = 0)
print(const_pred_val.shape)
(1441920, 128)

2. Multiple Linear Regression model¶

In [25]:
X_val = data.input_val
bias_vector_val = np.ones((X_val.shape[0], 1))
X_val = np.concatenate((X_val, bias_vector_val), axis=1)
mlr_pred_val = X_val@mlr_weights
print(mlr_pred_val.shape)
(1441920, 128)

3. Your pre-trained model¶

Load predictions into data_utils object¶
In [27]:
# Prepare the input and target data.
input_val = np.expand_dims(data.input_val, axis=2)
input_val = torch.tensor(input_val, dtype=torch.float32).to(device)
target_val = torch.tensor(data.target_val, dtype=torch.float32).to(device)
In [28]:
# Use the loaded model to make predictions on the validation data.
with torch.no_grad():
    pred_val = model(input_val)

# Append the predictions for the new model to the preds list
preds.append(pred_val.cpu().numpy())    
In [29]:
# Save the prediction results for subsequent analysis.
np.save("pred_validation.npy", pred_val.cpu().numpy())

Weight predictions and target¶

  1. Undo output scaling

  2. Weight vertical levels by dp/g

  3. Weight horizontal area of each grid cell by a[x]/mean(a[x])

  4. Convert units to a common energy unit

In [31]:
# This part requires the specific re-weighting described above;
# as a placeholder, predictions and targets are left unweighted here.
weighted_pred_val = pred_val
weighted_target_val = target_val
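As a rough illustration of step 2 above (weighting vertical levels by dp/g), here is a sketch using a hypothetical uniform pressure-thickness array dp; the actual ClimSim evaluation derives dp from the pressure grid set by data.set_pressure_grid() and applies the full weighting pipeline from data_utils.

```python
import numpy as np

g = 9.80665                  # gravitational acceleration (m/s^2)
dp = np.full(60, 100.0)      # hypothetical: 100 Pa thickness per level
level_weights = dp / g       # dp/g weights, shape (60,)

# Weight the 60 dT/dt and 60 dq/dt levels of a (N, 128) prediction
# array; the scalar outputs (columns 120-127) would instead be
# converted to a common energy unit.
pred = np.ones((4, 128))
weighted = pred.copy()
weighted[:, :60] *= level_weights     # dT/dt levels
weighted[:, 60:120] *= level_weights  # dq/dt levels
```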
In [32]:
# Calculate various evaluation metrics.
mae = mean_absolute_error(weighted_target_val.cpu().numpy(), weighted_pred_val.cpu().numpy())
rmse = np.sqrt(mean_squared_error(weighted_target_val.cpu().numpy(), weighted_pred_val.cpu().numpy()))
r2 = r2_score(weighted_target_val.cpu().numpy(), weighted_pred_val.cpu().numpy())
bias = np.mean(weighted_pred_val.cpu().numpy() - weighted_target_val.cpu().numpy())

metrics = {
    'MAE': mae,
    'RMSE': rmse,
    'R2': r2,
    'Bias': bias
}

Create plots¶

Plot for our CNN Model

In [88]:
# Plot - Only for My Model 
target_val = target_val.detach().cpu().numpy() if torch.is_tensor(target_val) else target_val
pred_val = pred_val.detach().cpu().numpy() if torch.is_tensor(pred_val) else pred_val

group_names = ['dT/dt', 'dq/dt', 'NETSW', 'FLWDS', 'PRECSC', 'PRECC', 'SOLS', 'SOLL', 'SOLSD', 'SOLLD']
group_indices = [range(0,60), range(60,120), range(120,121), range(121,122), range(122,123), range(123,124), range(124,125), range(125,126), range(126,127), range(127,128)]

def compute_metrics(target, pred):
    diff = target - pred
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff**2))
    corr_coeff = np.corrcoef(target.flatten(), pred.flatten())[0,1]
    r2 = corr_coeff**2
    bias = np.mean(diff)
    return mae, rmse, r2, bias

group_mae, group_rmse, group_r2, group_bias = [], [], [], []

for indices in group_indices:
    mae, rmse, r2, bias = compute_metrics(target_val[:, indices], pred_val[:, indices])
    group_mae.append(mae)
    group_rmse.append(rmse)
    group_r2.append(r2)
    group_bias.append(bias)

metrics_dict = {
    'Mean Absolute Error (MAE)': group_mae,
    'Root Mean Square Error (RMSE)': group_rmse,
    'R squared (R2)': group_r2,
    'Bias': group_bias
}

# Plot each metric as a separate bar chart and save the figure
for metric_name, metric_data in metrics_dict.items():
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.bar(group_names, metric_data, color='blue', label='CNN')
    ax.set_title(metric_name)
    ax.legend()
    ax.set_xticks(group_names)
    ax.set_xticklabels(group_names, rotation=45, ha="right")
    plt.tight_layout()
    plt.savefig(f"(Val)One_model_output_{metric_name.replace(' ', '_')}.png", dpi=300)
    plt.show() 
In [83]:
# Plot - For All three models + save figures

def compute_metrics(target, pred):
    diff = target - pred
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff**2))
    corr_coeff = np.corrcoef(target.flatten(), pred.flatten())[0,1]
    bias = np.mean(diff)
    return mae, rmse, corr_coeff**2, bias

group_indices = [list(range(0, 60)), list(range(60, 120)), [120], [121], [122], [123], [124], [125], [126], [127]]
metrics_functions = [compute_metrics]

labels = ['dT/dt', 'dq/dt', 'NETSW', 'FLWDS', 'PRECSC', 'PRECC', 'SOLS', 'SOLL', 'SOLSD', 'SOLLD']
colors = ['blue', 'orange', 'green']

# to save the result
all_mae, all_rmse, all_r2, all_bias = [], [], [], []

# calculate metrics for each group
for indices in group_indices:
    metrics_cnn = compute_metrics(target_val[:, indices], pred_val[:, indices])
    metrics_const = compute_metrics(target_val[:, indices], const_pred_val[:, indices])
    metrics_mlr = compute_metrics(target_val[:, indices], mlr_pred_val[:, indices])
    
    all_mae.append([metrics_cnn[0], metrics_const[0], metrics_mlr[0]])
    all_rmse.append([metrics_cnn[1], metrics_const[1], metrics_mlr[1]])
    all_r2.append([metrics_cnn[2], metrics_const[2], metrics_mlr[2]])
    all_bias.append([metrics_cnn[3], metrics_const[3], metrics_mlr[3]])


group_metrics = np.array([all_mae, all_rmse, all_r2, all_bias])  # Convert to numpy array
width = 0.2  # width of each bar
positions = np.arange(len(labels))  # x positions for each metric group

metric_names = ['Mean Absolute Error (MAE)', 'Root Mean Square Error (RMSE)', 'R squared (R2)', 'Bias']

# Create and save each metric plot separately
for metric_idx, metric_name in enumerate(metric_names):
    fig, ax = plt.subplots(figsize=(10, 4))  # Create a new figure for each metric
    for model_idx, model_name in enumerate(['CNN', 'const', 'mlr']):
        bar_positions = [pos + model_idx * width for pos in positions]
        ax.bar(bar_positions, group_metrics[metric_idx][:, model_idx], color=colors[model_idx], width=width, label=model_name)
    ax.set_title(metric_name)
    ax.legend()
    ax.set_xticks(positions + width)
    ax.set_xticklabels(labels, rotation=45, ha="right")
    plt.tight_layout()
    plt.savefig(f"(Val)models_output_plot_{metric_name.replace(' ', '_')}.png", dpi=300)
    plt.show() 
/Users/yoojin/opt/anaconda3/envs/env3/lib/python3.9/site-packages/numpy/lib/function_base.py:2897: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/Users/yoojin/opt/anaconda3/envs/env3/lib/python3.9/site-packages/numpy/lib/function_base.py:2898: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
In [152]:
display(img_val)

If you trained models with different hyperparameters, use the ones that performed the best on validation data for evaluation on scoring data.

Step 9. Evaluate on scoring data¶

Do this at the VERY END (once you have finished tuning the hyperparameters of your model and are seeking a final evaluation)¶

You also need to download the scoring dataset.

From 4. LEAP/subsampled_low_res (https://huggingface.co/datasets/LEAP/subsampled_low_res/tree/main),

you need to download the following two files to your local machine (they do not have to be inside the ClimSim repo):

  1. scoring_input.npy
  2. scoring_target.npy

After downloading, make sure you set downloaded_data_path so the data can be loaded as in Step 6.

If you used Google Drive to load the data, you can just skip the above process since downloaded_data_path is already set.

If not, set downloaded_data_path corresponding to your local environment as below:

In [86]:
# Set the path to my Downloads directory
''' 
TODO: Change the file path according to your local environment
'''
downloaded_data_path = '/Users/yoojin/Downloads/'
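If you prefer not to hard-code an absolute path with a trailing slash, a slightly more portable sketch (assuming the same two file names) uses pathlib, which handles separators for you and lets you fail early with a clear message:

```python
from pathlib import Path

# TODO: change this to wherever you downloaded the files
downloaded_data_path = Path.home() / "Downloads"

scoring_input_path = downloaded_data_path / "scoring_input.npy"
scoring_target_path = downloaded_data_path / "scoring_target.npy"

# Fail early if the files are missing, instead of deep inside np.load
for p in (scoring_input_path, scoring_target_path):
    if not p.exists():
        print(f"Missing file: {p} -- download it from the Hugging Face page first")
```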

Load scoring data¶

In [87]:
# Set the path for two files
scoring_input_path = downloaded_data_path + 'scoring_input.npy'
scoring_target_path = downloaded_data_path + 'scoring_target.npy'

# Load scoring input
data.input_scoring = np.load(scoring_input_path)
# Load scoring target
data.target_scoring = np.load(scoring_target_path)
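The scoring arrays are large (about 1.68 million rows of 128 features each), so if memory is tight you can memory-map them instead of reading everything at once. This is an optional sketch, not something the notebook requires; the small stand-in file here is invented so the example is self-contained:

```python
import numpy as np

# Save a small stand-in array; in the notebook you would point
# np.load at scoring_input.npy instead.
np.save("demo_scoring.npy", np.arange(12, dtype=np.float32).reshape(3, 4))

# mmap_mode="r" keeps the data on disk and reads slices on demand
arr = np.load("demo_scoring.npy", mmap_mode="r")
print(arr.shape)      # (3, 4)
row = arr[0].copy()   # .copy() materializes one row in memory
```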
In [114]:
# Experiment: run the same evaluation without the data_utils object

input_scoring = np.load(scoring_input_path)
# Load scoring target
target_scoring = np.load(scoring_target_path)

const_pred_scoring = np.repeat(const_model[np.newaxis, :], target_scoring.shape[0], axis = 0)

X_scoring = input_scoring
bias_vector_scoring = np.ones((X_scoring.shape[0], 1))
X_scoring = np.concatenate((X_scoring, bias_vector_scoring), axis=1)
mlr_pred_scoring = X_scoring@mlr_weights
In [115]:
# Experiment (continued): prepare tensors for the CNN
input_scoring = np.expand_dims(input_scoring, axis=2)
input_scoring = torch.tensor(input_scoring, dtype=torch.float32).to(device)
target_scoring = torch.tensor(target_scoring, dtype=torch.float32).to(device)
In [116]:
with torch.no_grad():
    pred_scoring = model(input_scoring)

# Append the predictions for the new model to the preds list
preds.append(pred_scoring.cpu().numpy())   
In [117]:
# Re-weighting is skipped here as well; predictions pass through unchanged
weighted_pred_scoring = pred_scoring
weighted_target_scoring = target_scoring
In [118]:
# Calculate various evaluation metrics.
mae = mean_absolute_error(weighted_target_scoring.cpu().numpy(), weighted_pred_scoring.cpu().numpy())
rmse = np.sqrt(mean_squared_error(weighted_target_scoring.cpu().numpy(), weighted_pred_scoring.cpu().numpy()))
r2 = r2_score(weighted_target_scoring.cpu().numpy(), weighted_pred_scoring.cpu().numpy())
bias = np.mean(weighted_pred_scoring.cpu().numpy() - weighted_target_scoring.cpu().numpy())

metrics = {
    'MAE': mae,
    'RMSE': rmse,
    'R2': r2,
    'Bias': bias
}
In [119]:
# Plot - Only for My Model 
target_scoring = target_scoring.cpu().numpy() if hasattr(target_scoring, 'cpu') else target_scoring
pred_scoring = pred_scoring.cpu().numpy() if hasattr(pred_scoring, 'cpu') else pred_scoring

group_names = ['dT/dt', 'dq/dt', 'NETSW', 'FLWDS', 'PRECSC', 'PRECC', 'SOLS', 'SOLL', 'SOLSD', 'SOLLD']
group_indices = [range(0,60), range(60,120), range(120,121), range(121,122), range(122,123), range(123,124), range(124,125), range(125,126), range(126,127), range(127,128)]

def compute_metrics(target, pred):
    diff = target - pred
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff**2))
    corr_coeff = np.corrcoef(target.flatten(), pred.flatten())[0,1]
    r2 = corr_coeff**2
    bias = np.mean(diff)
    return mae, rmse, r2, bias

group_mae, group_rmse, group_r2, group_bias = [], [], [], []

for indices in group_indices:
    mae, rmse, r2, bias = compute_metrics(target_scoring[:, indices], pred_scoring[:, indices])
    group_mae.append(mae)
    group_rmse.append(rmse)
    group_r2.append(r2)
    group_bias.append(bias)

metrics_dict = {
    'Mean Absolute Error (MAE)': group_mae,
    'Root Mean Square Error (RMSE)': group_rmse,
    'R squared (R2)': group_r2,
    'Bias': group_bias
}

# Create and save a figure for each metric
for metric_name, metric_data in metrics_dict.items():
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.bar(group_names, metric_data, color='blue', label='CNN')
    ax.set_title(metric_name)
    ax.legend()
    ax.set_xticks(range(len(group_names)))
    ax.set_xticklabels(group_names, rotation=45, ha="right")
    plt.tight_layout()
    plt.savefig(f"(Scoring)One_model_output_{metric_name.replace(' ', '_')}.png", dpi=300)
    plt.show() 
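Note that compute_metrics reports the squared Pearson correlation as "R2", which is not the same quantity as the coefficient of determination returned by sklearn's r2_score (used earlier in this notebook): a prediction with a constant offset can have correlation 1 but a much lower, even negative, coefficient of determination. A small numpy-only comparison:

```python
import numpy as np

target = np.array([1.0, 2.0, 3.0, 4.0])
pred = target + 10.0  # perfectly correlated, but strongly biased

# Squared Pearson correlation (what compute_metrics calls R2)
corr_sq = np.corrcoef(target, pred)[0, 1] ** 2   # ~1.0

# Coefficient of determination (what sklearn's r2_score computes)
r2_det = 1 - np.sum((target - pred) ** 2) / np.sum((target - target.mean()) ** 2)
print(corr_sq, r2_det)  # correlation-based R2 hides the bias entirely
```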
In [120]:
# Plot - For All three models + save figures
def compute_metrics(target, pred):
    diff = target - pred
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff**2))
    corr_coeff = np.corrcoef(target.flatten(), pred.flatten())[0,1]
    bias = np.mean(diff)
    return mae, rmse, corr_coeff**2, bias

group_indices = [list(range(0, 60)), list(range(60, 120)), [120], [121], [122], [123], [124], [125], [126], [127]]
metrics_functions = [compute_metrics]

labels = ['dT/dt', 'dq/dt', 'NETSW', 'FLWDS', 'PRECSC', 'PRECC', 'SOLS', 'SOLL', 'SOLSD', 'SOLLD']
colors = ['blue', 'orange', 'green']

# to save results
all_mae, all_rmse, all_r2, all_bias = [], [], [], []

# calculate metrics for each group
for indices in group_indices:
    metrics_cnn = compute_metrics(target_scoring[:, indices], pred_scoring[:, indices])
    metrics_const = compute_metrics(target_scoring[:, indices], const_pred_scoring[:, indices])
    metrics_mlr = compute_metrics(target_scoring[:, indices], mlr_pred_scoring[:, indices])
    
    all_mae.append([metrics_cnn[0], metrics_const[0], metrics_mlr[0]])
    all_rmse.append([metrics_cnn[1], metrics_const[1], metrics_mlr[1]])
    all_r2.append([metrics_cnn[2], metrics_const[2], metrics_mlr[2]])
    all_bias.append([metrics_cnn[3], metrics_const[3], metrics_mlr[3]])

group_metrics = np.array([all_mae, all_rmse, all_r2, all_bias])  # Convert to numpy array
width = 0.2  # for each bar
positions = np.arange(len(labels))  # each group's position

metric_names = ['Mean Absolute Error (MAE)', 'Root Mean Square Error (RMSE)', 'R squared (R2)', 'Bias']

# Create and save each metric plot separately
for metric_idx, metric_name in enumerate(metric_names):
    fig, ax = plt.subplots(figsize=(10, 4))  # Create a new figure for each metric
    for model_idx, model_name in enumerate(['CNN', 'const', 'mlr']):
        bar_positions = [pos + model_idx * width for pos in positions]
        ax.bar(bar_positions, group_metrics[metric_idx][:, model_idx], color=colors[model_idx], width=width, label=model_name)
    ax.set_title(metric_name)
    ax.legend()
    ax.set_xticks(positions + width)
    ax.set_xticklabels(labels, rotation=45, ha="right")
    plt.tight_layout()
    plt.savefig(f"(Scoring)models_output_plot_{metric_name.replace(' ', '_')}.png", dpi=300)
    plt.show() 
/Users/yoojin/opt/anaconda3/envs/env3/lib/python3.9/site-packages/numpy/lib/function_base.py:2897: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/Users/yoojin/opt/anaconda3/envs/env3/lib/python3.9/site-packages/numpy/lib/function_base.py:2898: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
In [154]:
display(img_scoring)

Set pressure grid¶

In [92]:
data.set_pressure_grid(data_split = 'scoring')

Load predictions¶

1. Constant prediction model¶

In [93]:
const_pred_scoring = np.repeat(const_model[np.newaxis, :], data.target_scoring.shape[0], axis = 0)
print(const_pred_scoring.shape)
(1681920, 128)
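np.repeat with np.newaxis simply tiles the single 128-dimensional constant-prediction vector once per scoring sample, which is why the result has shape (1681920, 128). A toy version of the same pattern, with invented sizes:

```python
import numpy as np

const_model_demo = np.array([1.0, 2.0, 3.0])  # pretend 3 output features
n_samples = 4                                  # pretend 4 scoring samples

# (3,) -> (1, 3) -> repeated along axis 0 -> (4, 3)
tiled = np.repeat(const_model_demo[np.newaxis, :], n_samples, axis=0)
print(tiled.shape)  # (4, 3): same vector repeated for every sample
```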

2. Multiple Linear Regression model¶

In [94]:
X_scoring = data.input_scoring
bias_vector_scoring = np.ones((X_scoring.shape[0], 1))
X_scoring = np.concatenate((X_scoring, bias_vector_scoring), axis=1)
mlr_pred_scoring = X_scoring@mlr_weights
print(mlr_pred_scoring.shape)
(1681920, 128)
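Appending a column of ones lets the last row of mlr_weights act as the intercept, so prediction is a single matrix multiply. A minimal sketch with made-up shapes (5 samples, 3 features, 2 outputs):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # 5 samples, 3 input features
W = rng.normal(size=(4, 2))   # (3 features + 1 bias row) x 2 outputs

# Augment with a ones column, exactly as the notebook does
X_aug = np.concatenate((X, np.ones((X.shape[0], 1))), axis=1)
pred = X_aug @ W
print(pred.shape)  # (5, 2)

# Equivalent formulation: features @ weights + intercept row
same = X @ W[:-1] + W[-1]
print(np.allclose(pred, same))  # True
```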

3. Your pre-trained model¶

Load predictions into data_utils object¶
In [103]:
# Prepare the input and target data.
input_scoring = np.expand_dims(data.input_scoring, axis=2)
input_scoring = torch.tensor(input_scoring, dtype=torch.float32).to(device)
target_scoring = torch.tensor(data.target_scoring, dtype=torch.float32).to(device)
In [104]:
# Use the loaded model to make predictions on the scoring data.
with torch.no_grad():
    pred_scoring = model(input_scoring)

# Append the predictions for the new model to the preds list
preds.append(pred_scoring.cpu().numpy())   
In [105]:
# Save the prediction results for subsequent analysis.
np.save("pred_scoring.npy", pred_scoring.cpu().numpy())

Weight predictions and target¶

  1. Undo output scaling

  2. Weight vertical levels by dp/g

  3. Weight horizontal area of each grid cell by a[x]/mean(a[x])

  4. Convert units to a common energy unit
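The notebook leaves this re-weighting unimplemented (the next cell passes predictions through unchanged). As an illustration of step 2 only, here is a hedged sketch of mass-weighting vertical levels by dp/g; the interface pressures and level count below are invented for the example, not taken from ClimSim:

```python
import numpy as np

g = 9.80665  # gravitational acceleration (m/s^2)

# Hypothetical interface pressures for 4 levels (Pa), top to surface
p_interfaces = np.array([0.0, 300e2, 600e2, 850e2, 1000e2])
dp = np.diff(p_interfaces)  # pressure thickness of each level

pred_levels = np.ones((2, 4))       # 2 samples x 4 vertical levels
weighted = pred_levels * (dp / g)   # per-level mass weighting, broadcast over samples
print(weighted.shape)  # (2, 4)
```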

In [106]:
# NOTE: the full re-weighting (undo output scaling, dp/g, grid-cell area,
# unit conversion) is not applied here; predictions pass through unchanged.
weighted_pred_scoring = pred_scoring
weighted_target_scoring = target_scoring
In [107]:
# Calculate various evaluation metrics.
mae = mean_absolute_error(weighted_target_scoring.cpu().numpy(), weighted_pred_scoring.cpu().numpy())
rmse = np.sqrt(mean_squared_error(weighted_target_scoring.cpu().numpy(), weighted_pred_scoring.cpu().numpy()))
r2 = r2_score(weighted_target_scoring.cpu().numpy(), weighted_pred_scoring.cpu().numpy())
bias = np.mean(weighted_pred_scoring.cpu().numpy() - weighted_target_scoring.cpu().numpy())

metrics = {
    'MAE': mae,
    'RMSE': rmse,
    'R2': r2,
    'Bias': bias
}

Create plots¶

Plot for our CNN Model

In [108]:
# Plot - Only for My Model 
target_scoring = target_scoring.cpu().numpy() if hasattr(target_scoring, 'cpu') else target_scoring
pred_scoring = pred_scoring.cpu().numpy() if hasattr(pred_scoring, 'cpu') else pred_scoring

group_names = ['dT/dt', 'dq/dt', 'NETSW', 'FLWDS', 'PRECSC', 'PRECC', 'SOLS', 'SOLL', 'SOLSD', 'SOLLD']
group_indices = [range(0,60), range(60,120), range(120,121), range(121,122), range(122,123), range(123,124), range(124,125), range(125,126), range(126,127), range(127,128)]

def compute_metrics(target, pred):
    diff = target - pred
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff**2))
    corr_coeff = np.corrcoef(target.flatten(), pred.flatten())[0,1]
    r2 = corr_coeff**2
    bias = np.mean(diff)
    return mae, rmse, r2, bias

group_mae, group_rmse, group_r2, group_bias = [], [], [], []

for indices in group_indices:
    mae, rmse, r2, bias = compute_metrics(target_scoring[:, indices], pred_scoring[:, indices])
    group_mae.append(mae)
    group_rmse.append(rmse)
    group_r2.append(r2)
    group_bias.append(bias)

metrics_dict = {
    'Mean Absolute Error (MAE)': group_mae,
    'Root Mean Square Error (RMSE)': group_rmse,
    'R squared (R2)': group_r2,
    'Bias': group_bias
}

# save imgs
for metric_name, metric_data in metrics_dict.items():
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.bar(group_names, metric_data, color='blue', label='CNN')
    ax.set_title(metric_name)
    ax.legend()
    ax.set_xticks(range(len(group_names)))
    ax.set_xticklabels(group_names, rotation=45, ha="right")
    plt.tight_layout()
    plt.savefig(f"(Scoring)One_model_output_{metric_name.replace(' ', '_')}.png", dpi=300)
    plt.show() 
In [110]:
# Plot - For All three models + save figures
def compute_metrics(target, pred):
    diff = target - pred
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff**2))
    corr_coeff = np.corrcoef(target.flatten(), pred.flatten())[0,1]
    bias = np.mean(diff)
    return mae, rmse, corr_coeff**2, bias

group_indices = [list(range(0, 60)), list(range(60, 120)), [120], [121], [122], [123], [124], [125], [126], [127]]
metrics_functions = [compute_metrics]

labels = ['dT/dt', 'dq/dt', 'NETSW', 'FLWDS', 'PRECSC', 'PRECC', 'SOLS', 'SOLL', 'SOLSD', 'SOLLD']
colors = ['blue', 'orange', 'green']

# to save results
all_mae, all_rmse, all_r2, all_bias = [], [], [], []

# calculate metrics for each group
for indices in group_indices:
    metrics_cnn = compute_metrics(target_scoring[:, indices], pred_scoring[:, indices])
    metrics_const = compute_metrics(target_scoring[:, indices], const_pred_scoring[:, indices])
    metrics_mlr = compute_metrics(target_scoring[:, indices], mlr_pred_scoring[:, indices])
    
    all_mae.append([metrics_cnn[0], metrics_const[0], metrics_mlr[0]])
    all_rmse.append([metrics_cnn[1], metrics_const[1], metrics_mlr[1]])
    all_r2.append([metrics_cnn[2], metrics_const[2], metrics_mlr[2]])
    all_bias.append([metrics_cnn[3], metrics_const[3], metrics_mlr[3]])

group_metrics = np.array([all_mae, all_rmse, all_r2, all_bias])  # Convert to numpy array
width = 0.2  # for each bar
positions = np.arange(len(labels))  # each group's position

metric_names = ['Mean Absolute Error (MAE)', 'Root Mean Square Error (RMSE)', 'R squared (R2)', 'Bias']

# Create and save each metric plot separately
for metric_idx, metric_name in enumerate(metric_names):
    fig, ax = plt.subplots(figsize=(10, 4))  # Create a new figure for each metric
    for model_idx, model_name in enumerate(['CNN', 'const', 'mlr']):
        bar_positions = [pos + model_idx * width for pos in positions]
        ax.bar(bar_positions, group_metrics[metric_idx][:, model_idx], color=colors[model_idx], width=width, label=model_name)
    ax.set_title(metric_name)
    ax.legend()
    ax.set_xticks(positions + width)
    ax.set_xticklabels(labels, rotation=45, ha="right")
    plt.tight_layout()
    plt.savefig(f"(Scoring)models_output_plot_{metric_name.replace(' ', '_')}.png", dpi=300)
    plt.show() 
/Users/yoojin/opt/anaconda3/envs/env3/lib/python3.9/site-packages/numpy/lib/function_base.py:2897: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
/Users/yoojin/opt/anaconda3/envs/env3/lib/python3.9/site-packages/numpy/lib/function_base.py:2898: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
(These RuntimeWarnings come from np.corrcoef; they most likely occur because the constant baseline predicts a single value, so its standard deviation is zero and the correlation, and hence R², is NaN for that model.)
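The ten index groups in the cell above are intended to partition the 128 ClimSim target columns (60 vertical levels each for the dT/dt and dq/dt tendencies, plus 8 surface scalars). A quick self-contained check of that assumption:

```python
# Sanity check: the ten output groups should cover columns 0..127
# exactly once, in order, with no gaps or overlaps.
group_indices = [list(range(0, 60)), list(range(60, 120)),
                 [120], [121], [122], [123], [124], [125], [126], [127]]
flat = [i for group in group_indices for i in group]
assert flat == list(range(128))  # contiguous, non-overlapping coverage
```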
In [155]:
display(img_scoring)

In summary, the CNN model outperforms the other two models, consistently showing the best results across all metrics. The MLR model comes second with reasonably good performance, while the const baseline generally lags behind.